ATS2 notes
2022-01-17 · 11 min read
Syntax #
- `abstype`: define an abstract type
- `vtypedef` / `viewtypedef`: define a viewtype
- `typedef`: define a binding b/w a name and a type (eff. an alias)
- `stadef`: define a _static_ binding b/w a name and a sort
- `fun foo(..): res = "mac#"`: compile to a C function without name mangling? often seen with `extern` before `fun`. is the `extern` necessary? (see the sketch just after this list)
- `lam (..): res => expr`: an anonymous lambda function
- `fix foo(..): res => expr`: a recursive, anonymous function named `foo`
- `{...}`: universal quantification
- `[...]`: existential quantification
- `(... | ...)`: (proof | value)
- `@(_, _, ...)`: flat tuple
- `.<...>.`: termination metric
- `@[T][N]`: flat array of N values of type T
- `@[T][N]()`: array instance
- `@[T][N](x)`: array initialized with N x's
- `sortdef nat = {a: int | a >= 0}`: define a subset sort
- `type`: a sort for machine-word-sized types (boxed types)
- `t@ype`: a sort for types of unspecified size (unboxed types)
- `viewtype`: a sort for a linear view + `type` sort
- `viewt@ype`: a sort for a linear view + unboxed `t@ype` sort
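A minimal sketch of the unmangled-name pattern (the name `sum2` is illustrative; `extern` here separates the declaration from its `implement`ation):

(* callable from C as plain `sum2`; no mangling *)
extern fun sum2(x: int, y: int): int = "mac#sum2"
implement sum2(x, y) = x + y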
staload #
Static load: creates a namespace containing loaded package
staload FOO = "foo.sats"
val a: $FOO.foo_t = $FOO.A()
(* like `use foo::*;` in rust *)
staload = "foo.sats"
val a: foo_t = A()
(* only load package; useful for pulling in DATS file containing *)
(* implementations of forward declared fns/types/etc... *)
staload _ = "bar.dats"
dynload #
TL;DR: if you get a link-time error about some undefined reference to a variable ending in `__dynloadflag`, you probably need a `dynload "<pkg>.dats"` somewhere.
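A sketch of the fix (the package name is illustrative):

(* in the executable's main .dats, alongside the staloads *)
dynload "pkg.dats"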
Libraries #
- `$PATSHOME/prelude`: prelude library targeting C codegen
- `$PATSHOME/libats`: libats library targeting C codegen
If a function is implemented as a template, the ATS compiler needs to access the source of the function implementation in order to generate code for each instance call of the function.
This is why we `#include "share/atspre_staload.hats"`, which is basically a list of `#staload _ = "prelude/DATS/<prelude-file>.dats"` lines, pulling in all the template function sources.
Mutual Recursion #
`isevn` and `isodd` are mutually recursive; you need the `and` keyword before `isodd` so that it's visible inside `isevn`
fun isevn (n: int): bool =
if n > 0 then isodd (n-1) else true
and isodd (n: int): bool =
if n > 0 then isevn (n-1) else false
ATS won't be able to tail-call optimize the above as-is. Instead you need to replace `fun` with `fnx` to indicate that the functions need to be combined. Then ATS will place a copy of `isodd` inside `isevn`.
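For example, the same pair with `fnx`:

fnx isevn (n: int): bool =
if n > 0 then isodd (n-1) else true
and isodd (n: int): bool =
if n > 0 then isevn (n-1) else false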
Tail-call Optimization #
A classic list_append that is not tail-recursive
fun {a: t@ype} list_append {m, n: nat} (
xs: list(a, m),
ys: list(a, n)
) : list(a, m + n) =
case+ xs of
| list_nil() => ys
| list_cons(x, xs) => list_cons(x, list_append(xs, ys))
Translating this into a tail-recursive version apparently requires "separating" the allocation of the new list element from initializing it with `x` and the recursively-computed tail (the inductive step).
fun {a: t@ype} list_append2 {m, n: nat} (
xs: list(a, m),
ys: list(a, n)
) : list(a, m+n) =
let (* tail recursive inner loop *)
fun loop {m: nat} (
xs: list(a, m),
ys: list(a, n),
res: &ptr? >> list(a, m+n)
) : void =
case+ xs of
| list_nil() => res := ys
| list_cons(x, xs1) =>
let (* allocate a list node with an _uninitialized_ *)
(* tail (hole) *)
val () = res := list_cons{a}{0}(x, _)
(* pattern match on res to grab a handle to the *)
(* tail *)
val+list_cons(_, res1) = res
(* recurse, which places xs1 || ys into res1, *)
(* which means res now contains the full xs || ys *)
val () = loop(xs1, ys, res1)
in (* folding translates into a no-op at run-time *)
fold@(res)
end
(* uninitialized result (stack allocated?) *)
var res: ptr
val () = loop (xs, ys, res)
in res
end
This technique is called "tail-recursion modulo allocation" (more commonly: "tail-recursion modulo cons")
// TODO: wtf is fold@(_)??? (see "Mutate In-place (fold@)" below)
Function Types #
There are two kinds of functions in ATS (like Rust): env-less functions and closures.
- env-less: like a function ptr; doesn't capture any variables in the scope/environment
- closure: a function ptr + a captured environment (represented as a tuple of values apparently)
Syntax #
- env-less:
  - `(int) -<fun1> void`
  - `(int) -> void` (shorthand)
- closure:
  - `(int) -<cloref1> void`
  - `cfun(int, void)` (shorthand, from libats/ML)
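A small sketch of each (the closure captures `n`; `=<cloref1>` is the closure-lambda form):

val add1 = lam (x: int): int => x + 1 (* env-less *)
val n = 10
val addn = lam (x: int): int =<cloref1> x + n (* closure capturing n *)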
Operators #
Can refer to an operator function w/o infix via the `op` keyword, e.g., `op+(1, 2)` vs `1 + 2`.
Local Bindings #
Let #
Introduces bindings for use in subsequent bindings and inside the final in .. end expression
val area =
let
val pi = 3.1415
val radius = 10.0
in
pi * radius * radius
end
Where #
Introduces bindings inside the expression immediately before where
val area = pi * radius * radius where {
val pi = 3.1415
val radius = 10.0
}
Local #
Like `let .. in .. end`, but what follows `in` is a sequence of declarations (visible in the containing scope) rather than an expression
local
val pi = 3.1415
val radius = 10.0
in
(* visible in the containing scope *)
val area = pi * radius * radius
end
On Types #
- Rather than viewing a type as the set of values it classifies, consider a type as a set of static, semantic meanings.
- an expression that can be assigned type T has the set of semantics implied by that type.
- an expression is "well-typed" if there exists a type T that can be assigned to the expression
- In other words, there can be many types and many different and possibly disjoint static semantics assignable to an expression.
- In this way, choosing a type for an expression is a statement of which static semantics are important or useful in the current context.
Property (Preservation): Consider an expression expr0 which can be assigned type T. If expr1 := eval(expr0), then expr1 can also be assigned type T. In other words, static type meaning is preserved under evaluation.
Property (Progress): Evaluation of an expression always "makes progress" towards a value. In other words, calling eval on an expression (and then on the result and so on) should eventually become a value.
I imagine progress is necessary for the compiler to prove that certain inductive types or recursive functions terminate or become structurally simpler with every inductive step.
Tuples #
(* unboxed, flat tuple *)
val xy = (123, "abc")
val x = xy.0 and y = xy.1
(* boxed tuple (ptr sized) *)
val zw = '( -1.0, "foo")
val z = zw.0 and w = zw.1
Note that the space b/w `'(` and `-1.0` is required for boxed tuples. Tuples are also sometimes written with `@` in front like `@("foo", 123)`
Records #
(* unboxed record *)
typedef point2D = @{
x = double,
y = double
}
val origin = @{ x = 0.0, y = 0.0 } : point2D
val x = origin.x and y = origin.y
val @{ x = x, y = y } = origin
val @{ y = y, ... } = origin
(* boxed record *)
typedef point2D = '{
x = double,
y = double
}
Sequence Expressions #
(
print "Hi"; (* return value must be void *)
123 + 456 (* result of expression is result of whole block *)
) (* has type int *)
(* can also use begin .. end *)
begin
print "Hi";
123 + 456
end
(* or let .. in .. end *)
let
val () = print "Hi"
val () = print "asdf"
in
123 + 456
end
Datatypes #
datatypes == garbage collectable objects?
seem to be auto boxed
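e.g., a quick illustrative definition (auto-boxed, so the recursive case needs no pointer juggling):

datatype tree =
  | Leaf of ()
  | Node of (tree, int, tree)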
Templates #
templates are bound via first-find rather than best-find
Termination Metric #
`.<expr>.` in a function signature tells the compiler that `expr` must be strictly decreasing on each recursive call
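e.g., factorial with the argument itself as the metric (a standard intro example):

fun fact {n: nat} .<n>. (x: int(n)): int =
  if x > 0 then x * fact(x - 1) else 1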
Case Expressions #
TL;DR: prefer case+ and val+ in almost all cases to mandate exhaustive pattern matches everywhere.
dataviewtype int_list_vt =
| nil_vt of ()
| cons_vt of (int, int_list_vt)
// xs: int_list_vt
// ~ is for freeing linear int_list_vt resource
case xs of
| ~nil_vt() => ..
| ~cons_vt(_, xs) => ..
// just plain `case` with non-exhaustive pattern
// ==> WARNING: pattern match is non-exhaustive
case xs of
| ~nil_vt() => ..
// `case+` with non-exhaustive pattern
// ==> ERROR: pattern match is non-exhaustive
case+ xs of
| ~nil_vt() => ..
// `case-` with non-exhaustive pattern means:
// "ignore this problem compiler, I know what I'm doing : )"
case- xs of
| ~nil_vt() => ..
// you can also add pattern guard conditions
case xs of
| ~cons_vt(x, xs) when x < 10 => ..
One detail to note is that while matching happens sequentially at runtime, typechecking of case clauses actually happens non-deterministically.
You can force a pattern to typecheck sequentially by using =>>. For example, this won't typecheck w/o using =>>:
fun {T, U: t@ype} {V: t@ype}
list_zipwith {n: nat} (
xs1: list (T, n),
xs2: list (U, n),
f: (T, U) -<cloref1> V
) : list (V, n) =
case+ (xs1, xs2) of
| (list_cons (x1, xs1), list_cons (x2, xs2)) =>
list_cons (f (x1, x2), list_zipwith (xs1, xs2, f))
| (_, _) =>> list_nil ()
Mutate In-place (fold@) #
Let's say we're matching on a linear list list_vt and want to modify an entry in-place. We don't want to be free'ing and alloc'ing garbage cons nodes just to change the entry.
(ERROR) For example, we might try to do this first:
fun list_vt_add1 {n: nat} (xs: !list_vt(int, n)): void =
case+ xs of
| cons_vt(x, xs1) =>
let
// ERROR: an l-value is required
val () = x := x + 1
in
// ..
end
| nil_vt() => // ..
This fails because x and xs1 are only treated as values when we use a plain pattern (i.e., they are not l-values), so they are not allowed to be modified into other values or types.
(CORRECT) Instead, the right syntax here looks like this:
fun list_vt_add1 {n: nat} (xs: !list_vt(int, n)): void =
case+ xs of
// .----- this is important
// v
| @cons_vt(x, xs1) =>
let
// CORRECT
val () = x := x + 1
prval () = fold@(xs) // <--- this is important
in
// ..
end
| nil_vt() => // ..
The @ in the pattern unfolds the dataviewtype variant, which binds x and xs1 to pointers to each field. This @ binding also implicitly view-changes xs to viewtype list_vt_cons_unfold(l_0, l_1, l_2). These *_unfold(l_0, ..) viewtype definitions are autogenerated for all dataviewtype variants, where l_0 points to the outer dataviewtype and l_i points to field_i (with field_1 being the first field).
After modifying x in the above let-block, notice that we also need this fold@(xs) proof statement. This built-in proof function "folds" the xs: list_vt_cons_unfold(l_0, l_1, l_2) back into a list_vt.
Val Expressions #
Just like case/case+/case-, you can pattern match in val expressions with exhaustiveness or not.
// ==> WARNING: non-exhaustive
val ~cons_vt(x, xs1) = xs
// ==> ERROR: non-exhaustive
val+~cons_vt(x, xs1) = xs
// silent : )
val-~cons_vt(x, xs1) = xs
Templates #
templates are monomorphized at compile time
fun {a,b: t@ype} swap(xy: (a, b)): (b, a) =
(xy.1, xy.0)
Note: the template parameters come before the function name
{a,b: t@ype} are the template parameters
- `t@ype`: a sort for static terms representing size-unspecified (== unsized?) types
- `type`: a sort for machine word-sized types (usually boxed types)
Polymorphic Functions/Datatypes #
polymorphic functions/datatypes allow for dynamic dispatch; accordingly, they are not monomorphized.
fun swap_boxed {T: type}{U: type} (xy: (T, U)): (U, T) =
(xy.1, xy.0)
Note: the polymorphic parameters come after the function name
Note: if we used t@ype here, we would get a C compiler error (not a typechecking error!) since the arguments have unspecified size.
References #
references in ATS are just heap-allocated boxed values. they're really only useful in GC'd programs or as top-level program state, since there's no safe way to free them (no linearity assistance).
the key interface is
val int_ref = ref<int>(123)
val () = !int_ref := 69
val () = println!("int_ref = ", !int_ref)
// > int_ref = 69
Arrays #
There are a few "variants" of arrays
- `array`: (TODO) static stack allocated array?
- `array_v`: (TODO) array view?
- `arrayptr`: (TODO) linear?
- `arrayref`: GC'd array reference (w/o size attached)
- `arrszref`: GC'd array reference (with size attached, like a fat pointer)
usage:
val size = g0int2uint_int_size(10)
val xs = arrszref_make_elt(size, 42069)
val x_3 = xs[3]
val () = xs[4] := 123
Matrices #
Note: row-major layout (like C), not column-major (like Fortran)!
- matrixptr
- matrixref
- mtrxszref
postfix sz
#define sz(x) g0int2uint_int_size(x)
val X = mtrxszref_make_elt(10sz (*nrow*), 5sz (*ncol*), 0 (*fill*))
val X_ij = X[i,j]
val () = X[i,j] := 123
Abstract Types #
Types w/ encapsulation, i.e., structure is not visible to the user
abstype: boxed abstract type (pointer-sized)
// declares abstract boxed intset type of size ptr
abstype intset = ptr
abst@ype: unboxed abstract type
// declares abstract unboxed rational type of size 2*sizeof(int)
abst@ype rational = (int, int)
Use assume to "concretize" the abstract type. This can happen at-most-once, globally.
assume my_abstract_type = my_concrete_type
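A minimal sketch of the whole pattern (the `intset` repr and `intset_empty` are illustrative assumptions):

abstype intset = ptr
extern fun intset_empty(): intset

local
  (* the at-most-once concretization; only code below sees the repr *)
  assume intset = list0(int)
in
  implement intset_empty() = list0_nil()
end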
Theorem Proving #
- ATS/LF component supports interactive theorem proving
- `prop`: a builtin sort representing all proof types
- `dataprop`: a prop type declared in a manner similar to datatype defs
A dataprop representing the Fibonacci relation fib(n) = r:
dataprop FIB(int, int) =
| FIB0(0, 0) of ()
| FIB1(1, 1) of ()
| {n: nat} {r0, r1: int} FIB2(n+2, r0+r1) of (FIB(n, r0), FIB(n+1, r1))
The sort assigned to FIB is (int, int) -> prop, i.e., FIB takes two ints and returns a proof. FIB0 and FIB1 constructors both take () and return a proof.
A dataprop for multiplication:
// int * int = int
dataprop MUL(int, int, int) =
// 0 * n = 0
| {n: int} MULbase(0, n, 0) of ()
// (m+1) * n = m*n + n
| {m: nat}{n: int}{p: int} MULind(m+1, n, p+n) of MUL(m, n, p)
// (-m) * n = -(m * n)
| {m: pos}{n: int}{p: int} MULneg(~(m), n, ~(p)) of MUL(m, n, p)
To prove that this relation is total, we want to construct a value of MUL for every choice of parameters. The most common way to do this is to write a total proof function over all parameters:
// ∀ m ∈ ℕ, n ∈ ℤ, ∃ p ∈ ℤ : m * n = p
prfun mul_nat_is_total {m: nat; n: int} .<m>. (): [p: int] MUL(m, n, p) =
sif m == 0 then
MULbase()
else
MULind(mul_nat_is_total{m-1, n}())
// ∀ m, n ∈ ℤ, ∃ p ∈ ℤ : m * n = p
prfn mul_is_total {m, n: int} (): [p: int] MUL(m, n, p) =
sif m >= 0 then
mul_nat_is_total{m, n}()
else
MULneg(mul_nat_is_total{~m, n}())
Note: the second proof function uses prfn rather than prfun for a specific reason: prfuns define recursive proof functions and thus require a termination metric. prfns must be non-recursive and therefore don't need a termination metric.
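To connect these props to runtime code, here's a hedged sketch of a multiplication function whose signature returns the MUL proof alongside the product (the name `imul2` is illustrative, borrowed from older ATS code):

extern fun imul2 {m, n: int} (x: int(m), y: int(n)): [p: int] (MUL(m, n, p) | int(p))

(* callers thread the proof through: *)
(* val (pf | xy) = imul2(x, y) *)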
datasort: like a datatype but static. doesn't support any dependent type stuff i think. you must also use scase instead of case to match on datasorts.
Q: what is the difference b/w a dataprop and a datasort??
Views #
A view is a linear prop (linear as in it must be consumed exactly once). Views power ATS's pointer ownership checking.
For example, an owned pointer with a view to an a at location l looks like
{a: t@ype} {l: addr} (pf: a @ l | p: ptr l)
or a maybe uninitialized a
(pf: a? @ l | p: ptr l)
While managing these linear views can be done by threading the proofs through the in-params and out-params of each function, ATS provides a convenient syntactic sugar for modifying the view in-place. Consider the ptr_set function:
fun {a: t@ype} ptr_set {l: addr} (
pf: !a? @ l >> a @ l
| p: ptr l,
x: a
): void
We can read the pf as "the function takes a proof of a maybe-uninit a at location l and will modify this proof (in the caller's context) to a definitely-init a at location l".
In general, the format `!V >> V'` means that the function changes the view `V` to `V'` upon its return. If V = V', then you can just write `!V` or `!V >> _` (TODO: what does this mean? does the typechecker fill the hole?).
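For contrast, a getter sketch where the view survives unchanged (modeled on the prelude's `ptr_get0`; double-check the exact signature):

fun {a: t@ype} ptr_get0 {l: addr} (
  pf: !a @ l
| p: ptr l
): a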
Manipulating Pointers #
Manually pass the proof in and out:
// (pf: a @ l | p: ptr l)
val x = ptr_get<int>(pf | p)
val () = ptr_set<int>(pf | p, 123)
Or use the pointer syntax:
// these modify pf in-place
val x = !p
val () = !p := 123
Viewtypes #
viewtypes are linear types in ATS. The name (view + type) suggests that a viewtype consists of a linear view V and a type T. Usually we refer to a viewtype by its definition VT, but you can actually split a viewtype into its component view V and type T (via pattern matching).
- This interpretation of a `VT` as a `(V | T)` could be thought of as encumbering a concrete, runtime value of type `T` with a linear tag `V` (which can also assert other properties ofc).
- Given the viewtype `VT`, we can compactly refer to the type portion by the syntax `VT?!`.
- `viewtype` and `viewt@ype` are two sorts, with the type portion in sorts `type` and `t@ype` resp.
- http://blog.vmchale.com/article/ats-linear
- https://bluishcoder.co.nz/2011/04/25/sharing-linear-resources-in-ats.html
L-Values and Call-by-Reference #
- An l-value is a pointer and a linear at-view proof: "value + address"
- Like C/C++, l-values can appear on the left-hand side of assignments
- The simplest l-value is `!p` where `p` is a `ptr l`; the typechecker must also find an at-view proof `a @ l` somewhere in the context.
- Normally, ATS uses call-by-value; however, you can pass l-values by reference. When this happens, the address (and not the value) is passed.
- A function can take a reference with `&`, like `fun foo(x: &a)`. Note that you don't use `!` to deref or assign, just the plain `x` (see the sketch after this list).
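A minimal call-by-reference sketch (standard intro-book material; `main0` is ATS2's simplified main):

fun incr (x: &int): void = x := x + 1

implement main0 () = let
  var n: int = 0
  val () = incr(n) (* n's address is passed; n == 1 after *)
in
  println!("n = ", n)
end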
Stack-Allocated Variables #
- Use `var x: T` to declare an uninit stack variable, or `var y: T = val` to init with `val`.
- Use `view@(x)` to get the view-type at the location inside `x`
- Use `addr@(x)` to get the location that `x` points to.
- Interestingly, you can see that the uninit decl. actually has type `view@(x) = T? @ x` while the init decl. has type `view@(y) = T @ y`
- `var z: int with pfz = 69` will bind `pfz = view@(z)` for you
- ATS ensures there are no dangling or escaped stack pointers by requiring that the linear at-view proof must be present in the context when the function returns.
- Can also stack allocate flat arrays in the obvious way:
var A = @[double][10]() // uninit stack array
var B = @[double][10](42.0) // size=10 stack array init w/ 42.0
var C = @[double](1.0, 2.0, 3.0) // size=3 init w/ (1.0, ..)
- Can also stack allocate closures via `lam@` (non-rec) and `fix@` (rec):
var x: int = 123
var y: int = 69
var mul = lam@ (): int => x * y
Variance (Rust) #
The variance of a generic type is relative to its generic arguments: a generic type F<T> has a variance with respect to T. Variance tells us how (or whether) subtyping on the inner parameter type T "passes through" to the outer generic type F<T>.
subtype: `T: U` means T is "at least as useful as" U. For example, `'static: 'a`, since you can use a static lifetime anywhere some stricter lifetime `'a` is needed.
There are three cases for variance: covariant, contravariant, and invariant.
- covariant: have `fn foo(&'a str)`, can call `foo(&'static str)`. we can provide `foo` with a stronger type than it requires. in this case, `F<_> = &'_ str`.
- contravariant: have `fn foo(f: Fn(&'a str) -> ())`, can't provide `foo(fn(&'static str) {})`. we need to provide `foo` with a weaker type than it expects.
- invariant: no additional subtyping relation can be derived.
Variance (ATS) #
You can express covariance and contravariance (I believe?) in some parameter by adding a `+` or `-` to certain keyword sorts:
Confirmed this is true for view+ and prop+ for covariance. Just guessing on the rest.
prop+ prop-
view+ view-
type+ type-
tbox+ tbox-
tflt+ tflt-
types+ types-
vtype+ vtype-
vtbox+ vtbox-
vtflt+ vtflt-
t0ype+ t0ype-
vt0ype+ vt0ype-
viewtype+ viewtype-
viewt0ype+ viewt0ype-
opt #
There's a type constructor `opt(T, B)` where `T` is a type and `B` is a bool. If `B = true` then `opt(T, B)` is `T`; if `B = false`, it's `T?`.
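The prelude also has proof functions for moving in and out of `opt`; signatures sketched from memory, so double-check against the prelude:

prfun opt_some {a: t@ype} (x: !a >> opt(a, true)): void
prfun opt_none {a: t@ype} (x: !a? >> opt(a, false)): void
(* and the inverses: opt_unsome / opt_unnone *)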
- `llam (..) => ..` appears to be a linear lambda closure?
- `(..) -<lin,prf> ..` appears to be the type of a linear, prfun closure?